(54) Methods for protein analysis using protein capture arrays



(57) The invention relates to a method allowing 
identification and/or quantifying and/or characterizing 
proteins in a protein mixture, wherein the proteins are 
stratified on feature(s) on an array, and a procedure, 
preferably mass spectrometric analysis, is applied on 
the proteins on the feature(s), allowing determination of
the nature and quantities of the proteins. In particular,
the method allows the comparative analysis of nature
and amount of proteins in at least two samples. It also 
allows the targeted selection of proteins out of a mixture 
of proteins. It further identifies three-dimensional struc- 
tures that can interact with a selected target protein or 
a modification of said protein. 
[0001] The invention relates to a method allowing
identification and/or quantifying and/or characterizing 
proteins in a protein mixture, wherein the proteins are 
stratified on feature(s) on an array, and a procedure, 
preferably a mass spectrometric analysis, is applied to 
the proteins on the feature(s), allowing determination of 
the nature and quantities of the proteins. In particular, 
the method allows the discriminative analysis of inter- 
actions of proteins with a three-dimensional structure. It 
also allows the targeted selection of proteins out of a 
mixture of proteins. It further identifies three-dimension- 
al structures that can interact with a selected target pro- 
tein or a modification of said protein. 
[0002] The term "target" refers to a protein molecule 
that has an affinity for a given compound on a feature. 
A target can be employed in its unaltered state (prefer- 
ably with no alteration of the 3-dimensional structure of 
the protein). Targets may also be modified. 
[0003] In preferred embodiments, they harbor a fluo- 
rescent or radioactive moiety, or groups or isotopes that 
can be identified by mass spectrometry. In specific em- 
bodiments, targets are labeled, wherein said labeling 
consists in a chemical modification of the proteins; preferably,
said chemical modification does not alter the 3-dimensional
structure of the protein.
[0004] In some embodiments, said chemical modifi- 
cation consists of attaching a chemical group chosen in 
the group consisting of trinitrobenzene sulfonic acid, 
ethylthiofluoro acetate, succinic anhydride, phenyli- 
sothiocyanate, Dansyl chloride, acetic anhydride, poly- 
ethylene glycol, and similar reagents to the (deprotect- 
ed) N-terminal group of the protein. 
[0005] In other embodiments, said chemical modifica- 
tion consists of inducing SH-specific protein modifica- 
tions with an agent chosen in the group consisting of 
mercaptoethanol, dithiothreitol, iodoacetic acid,
iodoacetamide, and the like.

[0006] In yet another embodiment, said chemical 
modification consists of modifying carboxyl groups of 
proteins by full or limited amidation using an agent chosen
from the group consisting of
1-ethyl-3-[3-(dimethylamino)propyl]carbodiimide
hydrochloride (EDC), an amine such as glycine methyl ester,
glycinamide, methylamine, ethanolamine and the like.
[0007] In yet another embodiment, said chemical 
modification consists of a modification of tyrosines by 
nitration with tetranitromethane (also oxidation of thiols),
or of tryptophans by specific oxidation, for example by 
N-bromosuccinimide, or by limited oxidation with ozone. 
[0008] A "feature" according to the invention is de- 
fined as an area of a substrate having a collection of 
same-nature, surface-immobilized molecules. One fea- 
ture is different than another feature if the molecules of 
the different features have a different structural formula 
and/or 3-dimensional conformation. 
[0009] The term "array" refers to a substrate having a
two-dimensional surface having at least two different
features. Arrays are preferably ordered so that the
localization of each feature on the surface is defined. In
preferred embodiments, an array can have a density of
at least five hundred, at least one thousand, at least 10
thousand, or at least 100 thousand features per square
cm. The substrate can be, merely by way of example,
glass, silicon, quartz, polymer, plastic or metal and can
have the thickness of a glass microscope slide or a glass
cover slip. Substrates that are transparent to light are
useful when the method of performing an assay on the
chip involves optical detection. The substrate may also
be a membrane made of polyester or nylon. In this
embodiment, the density of features per square cm is
between a few units and a few dozen.

[0010] Preparation of arrays and features is described 
below. 

[0011] The term "distinguishable phenotype" has to
be understood as a phenotype (i.e. a qualitatively or
quantitatively measurable feature of an organism) that
allows the categorization of a given population. For
example, a distinguishable phenotype encompasses
membership of the group affected by a given disease, or a
particular feature or property (e.g. resistance or adverse
effect when given a drug).

[0012] The most important of the genome projects,
the complete sequence of the human genome, has
recently been finished. This project revealed the complete
sequence of the 3 billion bases and the relative positions
of all of the estimated 30,000-40,000 genes in this genome. The
genes are translated into a far larger number of proteins,
for example by differential splicing, and the proteins can
in addition be post-translationally modified, for example
by the formation of disulfide bridges between cysteines
or phosphorylation of amino acid side chains.
Additionally, protein expression can be up- and down-regulated
depending on the status of a cell.
[0013] Further variations of proteins can be expected
based on DNA sequence variations from one individual
to the next. One branch of genomics, termed genotyping,
focuses on the assessment of genomic sequence
variation for the attribution of causative gene variants.
Genomic sequence is static and thus does not allow the
determination of the point of onset of a genetic disease
without knowing the real correlations in a cell.

[0014] The analysis of levels of expression of RNA
transcripts is more indicative. Up- and down-regulation
of mRNA templates for protein synthesis is detected,
giving an indirect hint about the protein level in the cell.
Although quantification can be done, the real quantity of
proteins remains uncertain and secondary modifications
cannot be elucidated. Further, RNA is less stable
than DNA and thus more difficult to handle, and
normalization of RNA levels does pose problems.
Oligonucleotide arrays have reached great popularity for
expression analysis (Duggan et al. Nat. Genet. 21 (Suppl.),
10-13 (1999)). The RNA pool of control cells is tagged
with one fluorescent dye, while the RNA pool of cells
deriving from cases is tagged with a different dye. Both
pools are simultaneously hybridized to one array and by
comparison of the emitted fluorescence of the two dyes
quantification is achieved. A number of review articles
dealing with RNA and array technologies in general
were published in a supplement of Nature Genetics in
January 1999.

[0015] Moving to the next level, the study of all proteins
of a cell, termed proteomics, has reached great popularity
because it directly analyses the protein status and thus
the active components of a cell. A proteome has been
defined as the protein complement expressed by the
genome of a cell or an organism. Although the real
problems might be tackled by proteomics, suitable methods
that give a global high-resolution overview are currently
not available. In addition to secondary modifications,
the most interesting processes occur at low-level
regulation of gene expression and are linked to changes from
no copy per cell to very few copies. Both on the RNA
and protein level these are currently hard if not
impossible to detect.

[0016] Proteomics has mainly been advanced 
through the application of mass spectrometry (Karas 
and Hillenkamp, Anal. Chem. 60, 2299-2301 (1988); 
Fenn et al., Science 246, 64-71 (1989)). In proteomics 
matrix-assisted laser desorption/ionization mass spec- 
trometry (MALDI) is used to analyze the product mixture 
of proteins digested by trypsin. The detected masses 
give a fingerprint that on comparison with a database 
allows the identification of the proteins. 
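
The database comparison described in this paragraph can be illustrated with a minimal sketch. It assumes a fixed mass tolerance and a toy in-memory database; the function names `count_matches` and `rank_candidates`, the protein names and all mass values are hypothetical and are not taken from the application or from any existing search engine.

```python
def count_matches(observed, theoretical, tol=0.2):
    """Count observed masses lying within tol (Da) of any theoretical mass."""
    return sum(
        any(abs(obs - theo) <= tol for theo in theoretical)
        for obs in observed
    )


def rank_candidates(observed, database, tol=0.2):
    """Rank database proteins by number of matched peptide masses (best first)."""
    scores = {
        name: count_matches(observed, masses, tol)
        for name, masses in database.items()
    }
    # Sort by descending score, then by name so ties are ordered reproducibly.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical theoretical peptide masses (Da) for two made-up proteins.
toy_database = {
    "protein_A": [842.51, 1045.56, 1479.80, 2211.10],
    "protein_B": [927.49, 1045.60, 1638.93],
}
observed_masses = [842.45, 1045.58, 1479.75]
ranking = rank_candidates(observed_masses, toy_database)
```

A simple match count is used here only for clarity; real fingerprint search tools apply probability-based scoring, which this sketch does not attempt to reproduce.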
[0017] When the fingerprint of the protein is not
known, it remains possible to identify the protein by
sequencing the different peptides, for example using
electrospray ionization mass spectrometry (ESI) or
through conventional methods. Digestion of the protein
by various proteases followed by identification of the
mass and sequence of the peptides allows the
determination of the whole sequence of the protein. These
methods are well known to the person skilled in the art
(Jenkins and Pennington, Proteomics 1, 13-29 (2001);
Siuzdak, Proc Natl Acad Sci USA 1994; 91(24):
11290-7).

[0018] However, great interest lies in an extensive 
analysis of proteins contained in a sample, in particular 
a biological sample, for example a bodily fluid, or a sam- 
ple harvested from a particular organ, especially a tu- 
mor. Several tens of thousands or even hundreds of 
thousands of proteins could be in such a fluid, or organ. 
[0019] It is also interesting to perform comparative 
analysis between two or more samples, in order to de- 
termine the difference in the protein contents between 
said samples, especially when one sample is originating 
from normal state and another is originating from a path- 
ological state. Comparison of the proteins within these 
two samples would give a protein fingerprint that is spe- 
cific of the studied pathological state. This fingerprint 
could thus be used in a diagnosis process, and may allow
early identification of the pathology before the appearance
of the clinical signs.
[0020] Comparative analysis may also be very
interesting in order to determine the responsiveness of a
target patient to a test or treatment. This would make it
possible to better adapt treatments to the patient, which is
extremely interesting in cancer therapy. For example, an
analysis is first performed, using the method of the invention
as described below, in order to identify the proteins that are
qualitatively or quantitatively differentially expressed
following treatment of a patient responsive to said test or
treatment. A fingerprint of "responsiveness to
treatment" is thus obtained. An analysis of biological
samples obtained from said target patient before and
after the start of the test or treatment is then performed, and
the match to the fingerprint makes it possible to deduce the
responsiveness of said target patient to said test or treatment
from the presence of said proteins identified in the first
step.

[0021] In state-of-the-art practice, extracted proteins
are separated on two-dimensional gels. The first
separation dimension is achieved by isoelectric focuss- 
ing with a pH gradient, the second by size (Klose Meth- 
ods Mol. Biol. 112, 147-172 (1999)). The gel is then 
stained with Coomassie or silver. Usually the detection 
sensitivity allows the identification of a few thousand 
spots per gel. Spots are excised, digested with a pro- 
tease (e.g. trypsin) and analysed by MALDI (Karas and 
Hillenkamp Anal. Chem. 60, 2299-2301 (1988)). In order
to increase the efficiency of the tryptic digestion,
processing of samples in very small volumes can be 
done (Eickhoff et al. WO 01/26797 A2). Further informa- 
tion about the peptides can be obtained by sequencing 
using the post source decay mode of a MALDI mass 
spectrometer, or ESI mass spectrometer. 
[0022] Unfortunately the detection thresholds of the 
gel-staining methods do not allow the detection of a 
large part of the proteins present and low level proteins 
are masked by proteins of high abundance. Only several 
thousand proteins are usually identified on a gel, which 
probably represents only a few percent of the total proteins
present. The majority of detected proteins are
products of housekeeping genes with little diagnostic interest and
impact. Additionally, as the proteins are only separated
by two properties (size and pI) the resolution of 2-D gels
is not high enough. With 2-D gels quantification is prac- 
tically impossible. Analysis by means of 2-D gels is fea- 
sible particularly for proteins of a size from 30kD-80kD. 
Another major drawback is that only soluble proteins 
can be separated with gels. Membrane proteins that are 
of greatest physiological interest are impossible to sep- 
arate on gels. A comprehensive review is given by 
Jenkins and Pennington (Proteomics 1, 13-29 (2001)).
[0023] A technique complementary to 2-D gels,
termed SELDI (surface enhanced laser desorption/ionisation),
was developed recently (US 5719060, US
6020208; EP0990256; EP0990257; EP0990258). The
procedure is based on chromatographic procedures
followed by mass spectrometric analysis (Siegel J. Mass
Spectrom. 33, 264-273 (1998); Washburn Nat. Biotechnol.
242-247 (2001)). Thereby proteins of a mixture are
bound to several unspecific features including
hydrophobic surfaces, ionic surfaces etc. and subsequently
washed using different conditions so that only a part of
the proteins remains on a feature. By this procedure the
complexity of a mixture is reduced as proteins are
allocated to a certain position. For MALDI analysis matrix
is applied onto the features and the proteins are
analyzed in a time-of-flight mass spectrometer. This method
is feasible for small proteins not exceeding a mass
range of 30 kD. It is the belief of the authors that
quantification is not feasible by SELDI as generally only small
and broad signal peaks are obtained. Secondary
modifications cannot be analyzed.
[0024] 2-D gels and SELDI are complementary
techniques with respect to the size of the proteins, with
SELDI having the advantage of being more powerful as more than
two dimensions are used for separation of proteins (WO
98/59360; WO 00/66265; WO 00/67293). However, neither
method is suited to quantification and, as whole
proteins are measured, the resolution of signals is not high.
An unambiguous identification of proteins is impossible.
Furthermore, the analysis of secondary modifications of
all proteins in a sample is difficult if not impossible.
[0025] Several procedures for the quantification of
proteins have been described (Zhou et al. Nat. Biotechnol.
19, 375-378 (2001); Oda et al. Nat. Biotechnol. 19,
379-382 (2001); WO 00/1 1 2208). However, these methods
are not suitable for a survey of the complete protein
load of a cell. They were rather developed to enrich
certain proteins. Specific reagents containing a reactive
group to tag defined chemical functions of an amino acid
of peptides deriving from a tryptic digestion are used.
The tags contain a linker and a binding group, generally
consisting of biotin, that can be used for separation on
streptavidin-coated magnetic beads. The linker is
further used for introducing isotopic labels. Similar to RNA
quantification, proteins captured from control cells are
tagged with a molecule containing a linker with one kind
of isotope, while the proteins of cells derived from cases
are tagged with molecules containing another isotope.
Just as two fluorescent emissions are used for quantification
on RNA arrays, the signal intensities of corresponding
peptides or proteins in the mass spectrum are compared
for quantification, additionally using internal and external
standards. In a variant of this approach, control and
case cells are fed with different isotopes of nitrogen, so
that the proteins of case and control cells are
distinguishable by comparison of signal heights in mass spectra
(Oda et al. Proc. Natl. Acad. Sci. USA, 96, 6591-6596
(1999)).

[0026] An alternative method for protein quantification
could be the use of protein arrays and detection and
quantification of binding using surface plasmon resonance
(SPR) analysis. This is an optical detection system
that was developed recently by Biacore
(www.biacore.com). This method is also described in combination
with arrays of chemical libraries (DE 19923820; DE
10008006; DE 19920156). This method has also been
used in conjunction with subsequent mass spectrometry
analysis of affinity bound samples (Nedelkov and
Nelson: International Laboratory, 31 (6), September,
8-15 (2001)).

[0027] A further method for quantification uses a lu- 
minescent or radioactive substance, an enzyme or a 
metal containing substance for quantification of antibod- 
ies or antigens (US 4020151). 

[0028] Electrospray ionization mass spectrometry 
(ESI) is another method used to characterize proteins 
(Fenn et al. Science 246, 64-71 (1989)). In general re- 
cording a spectrum in ESI is slower than MALDI, yet 
gives higher resolution. Like MALDI, ESI can be used 
to generate sequence information of peptides. Peptides 
are sequenced by using the collision induced fragmen- 
tation of the peptides in the mass spectrometer. 
[0029] Apart from these two ionization methods 
(MALDI and ESI) for volatilization of biomolecules, huge 
progress has been made in recent years in terms of sep- 
aration of ions and analysis thereof in the mass spec- 
trometer. The main developments were made in the use 
of alternative ion extraction procedures in MALDI, and 
the applications of quadrupoles and ion traps to isolate
specific ions in ESI, reflectrons to increase resolution 
and the usage of orthogonal set-ups to pulse ion pack- 
ages into the mass spectrometer. State-of-the-art mass 
spectrometric analysis allows virtually any combination 
of MALDI and ESI with any separation and analysis 
method (reflectrons, time-of-flight analysis,...). 
[0030] By fabricating microarrays of small molecules 
(prepared by split-and-pool synthesis), large libraries of 
compounds can be screened very efficiently to identify 
new ligands for virtually any protein of interest (Schreib- 
er Science 17, 1964-1969, (2000)). Such ligands can 
then be used to study the biological role of its protein 
target by perturbing its function in vivo. 
[0031] Protein arrays are becoming a reality (WO
00/54046). Recent advances in protein array technology
are described in several publications (Kodadek,
Chem. Biol. 8, 105-115 (2001); Haab et al., Genome Biology
2, 0004.1-0004.13 (2001); Zhu and Snyder, Current
Opinion in Chemical Biology, 5, 40-45 (2001);
Fields, Science 291, 1221-1224 (2001); MacBeath and
Schneider, Science 289, 1760-1763 (2000); Emili and
Cagney, Nature Biotechnology 18, 393-397 (2000);
Walter et al., Current Opinion in Microbiology 3, 298-302
(2000); Arenkov et al., Analytical Biochemistry 278,
123-131 (2000); Holt et al., Current Opinion in
Biotechnology 11, 445-449 (2000); Roda et al., Biotechniques
28, 492-496 (2000); Lueking et al., Analytical
Biochemistry 270, 103-111 (1999); WO 00/29444; WO
00/04382).

[0032] For protein arrays, proteins are expressed and 
attached to a surface of a glass slide or other support in 
an arrayed pattern. In conjunction with high throughput 
expression and purification of recombinant proteins, 
microarrays of functionally active proteins were prepared
on glass slides. These arrays are then used to
identify protein-protein interactions, to identify, for ex- 
ample, the substrates of protein kinases, or to identify 
the targets of biologically active small molecules. 
[0033] While transcriptional profiling provides invalu- 
able insight into biological function on a genome-wide 
scale, it does not offer information on regulation that oc- 
curs at the protein level (e.g., degradation, phosphor- 
ylation/dephosphorylation, sub-cellular localisation, 
etc.). The possibility of using microarrays of antibodies 
to study regulation at the protein level is under investi- 
gation (de Wildt et al. Nat. Biotechnol. 18, 989-994 
(2000)). Polyclonal antibodies can be produced by initi- 
ating an immunological reaction of an animal caused by 
high abundance of the protein of interest. 
[0034] In principle, libraries of proteins or peptides de- 
riving from phage display, ribosome display or any other 
method to create libraries of proteins and peptides can 
be spotted onto a surface for subsequent binding of pro- 
teins (Li et al. Nat. Biotechnol. 18, 1251-1256 (2000); 
Kay et al. Methods 24, 240-246 (2001); Holt et al. Curr. 
Opin. Biotechnol, 11, 445-449 (2000)). 
[0035] The construction of arrays by molecule librar- 
ies is not restricted to poly amino acids or organic mol- 
ecules but can also be done by nucleic acids such as 
RNAs analogously to RNA arrays. However, the gener- 
ation of specific addresses for protein binding is not that 
easy. Nucleic acids on RNA expression arrays bind, 
obeying the rules of Watson-Crick base-pairing. For
specific protein binding, suitable RNAs have to be identified
in a selection process termed SELEX (Sun Curr.
Opin. Mol. Ther. 100-105 (2000); Jayasena Clin. Chem.
1628-1650 (1999); Doi and Yanagawa Comb. Chem.
High Throughput Screen 4, 497-509 (2001); WO
99/27133). 

[0036] Gel-pad based microarrays are described in 
several publications (US 5981734; US 6143499; US 
5770721; US 5756050).

[0037] Nanoelectrode arrays are also a possible so- 
lution for separation of protein mixtures on a chip. Three- 
dimensional electrochemical binding profiles, which 
mimic traditional chemical binding sites, are applied (US 
6123819) to capture specifically a protein. 
[0038] Other publications relevant to the state of the
art are WO 00/61806, WO 00/54046 and US 4020151.
[0039] The state-of-the-art of protein chemistry and 
methods for protein analysis is described in "Proteome 
and protein analysis", ed. Kamp, Springer Verlag, ISBN 
3-540-65891-2 (2000) and "Proteins Labfax", ed. Price, 
BIOS Scientific Publishers, ISBN 0-12-564710-7 
(1996). 

[0040] Proteomics (systematic analysis of proteins) 
suffers from the severe limitation that with 2-D gel anal- 
ysis and subsequent mass spectrometry analysis of 
tryptic digest products only very abundant proteins, that 
are of limited interest, can be analysed. Protein arrays 
on the other hand, at the current state-of-the-art, are
difficult to produce with high variability and resolution.
Currently no method exists to generate a protein array with
high variant coverage, a possibility of normalizing it, and a
method to analyze it with high resolution, thus providing
a method for high resolution, selective protein analysis
which can also be applied to the analysis of low
abundance proteins. Another major problem of protein
analysis is that no possibility exists to analyze two or more
complete protein extracts simultaneously on one
analysis device, thus eliminating the variability between two
analysis devices.

[0041] In contrast to expression profiling, the reaction
sequence of a state-of-the-art protein analysis
experiment lays itself significantly more open to experimental
variation and two situations are never directly
comparable. In protein analysis, two protein extracts are not
analyzed in the same experiment at the same time.
Different samples are dealt with sequentially.
[0042] The invention provides a method of analysing
proteins that may be used to allow simultaneous
analysis of two or more complete protein extracts on the same
analysis platform, with complete resolution of relative
protein identities, as well as analysis of post-translational
modifications and quantities.

[0043] This invention relates to a method for protein
analysis. The operating medium of the method is a
capture array. This capture array provides a means for
stratification of the protein extract (separation of the
proteins according to some of their structural features) and
later a support for the subsequent treatment.

[0044] Thus, the invention relates to a method for
identifying and/or quantifying and/or characterizing
multiple proteins in a sample containing proteins, said
method comprising:

a) optionally labeling the proteins in the sample with
a marker,

b) bringing the proteins into contact with an array
comprising one or more feature(s), leading to
specific capture of different proteins on different
feature(s) on the array,

c) applying, to the protein(s) captured on the
feature(s) of the array, a procedure giving a fingerprint
specific of the protein(s) on the feature(s), the
comparison of the data obtained for the protein(s) on the
feature(s) with a fingerprint library (database)
allowing identification and/or quantification and/or
characterization of the proteins present in the sample,
including the post-translational modifications.

[0045] In a specific embodiment of the invention, the
composition or sequence of the proteins in said
biological sample may be at least partially unknown, and the
identification and the characterization of the unknown
protein that cannot be performed by comparison with
databases is performed by further sequencing of the
proteins or peptides, in particular by mass spectrometry.
[0046] Preferably, the capture of the proteins by the
feature(s) on the array depends on the structure of the
proteins, in particular the primary structure of the protein
(sequence of the protein), but preferably on the
3-dimensional structure of the protein.

[0047] In a specific embodiment, the quantification of
the proteins is performed by adding, to the sample, a
specific and quantified protein as a standard, the
quantification of the proteins in the sample being calculated
by comparison with the standard. The absolute
quantification of the proteins is obtained from their relative
weight with regard to the quantity of the standard
protein. It is calculated from the intensity of the signals
obtained, constitutive of the fingerprint.
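
The calculation against a spiked standard can be sketched as follows. This is only an illustration under a strong simplifying assumption, namely that signal intensity is proportional to amount with the same response factor for every protein; the function name `quantify_against_standard` and the numeric values are hypothetical.

```python
def quantify_against_standard(intensities, standard_name, standard_amount):
    """Estimate absolute amounts from signal intensities and a spiked standard.

    intensities: mapping of protein name -> summed signal intensity.
    standard_amount: known quantity of the standard (any unit, e.g. pmol).
    Assumes equal response factors for all proteins (a simplification).
    """
    standard_intensity = intensities[standard_name]
    if standard_intensity <= 0:
        raise ValueError("standard signal must be positive")
    return {
        name: standard_amount * intensity / standard_intensity
        for name, intensity in intensities.items()
        if name != standard_name
    }


# Hypothetical intensities; the standard was spiked at 10.0 pmol.
signals = {"standard": 200.0, "protein_A": 100.0, "protein_B": 400.0}
amounts = quantify_against_standard(signals, "standard", 10.0)
```

In practice a calibration curve for each analyte would be preferable to a single-point ratio, since ionization efficiency differs between peptides.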
[0048] In the most preferred embodiment, the
fingerprint is peptide-based.
[0049] Preferably, in this embodiment, the proteins
are captured on the array, which allows stratifying the
proteins from the starting protein mixture. The
procedure that is subsequently applied, in order to obtain the
fingerprint useful for the subsequent simultaneous
identification and quantification of the proteins, comprises
breaking down the captured proteins into specific
peptide fragments on the feature, preferably followed by
identification of the proteins by their peptide fingerprints
by mass spectrometry. The breaking down of the
proteins into specific peptides is preferably performed by
digestion with a specific enzyme such as trypsin, that
cuts the proteins at specific and well known amino acids.
Starting from protein databases such as the one on
the NCBI web site (http://www.ncbi.nlm.nih.gov), or such
as SwissProt or the EMBL database, it is easy to
simulate digests of proteins with trypsin, and build a
database linking trypsin-digest peptides and proteins.
Software exists that simulates digests of proteins by various
proteolytic enzymes or reagent cleavage
(http://bioweb.pasteur.fr/seqanal/interfaces/digest.html).
Databases exist that integrate whole sequence DNA translated into
theoretical protein peptide fingerprints, such as Mowse
or Mascot, distributed by Matrix Science (London, UK,
www.matrixscience.com).
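
A simulated tryptic digest of the kind mentioned above can be sketched in a few lines. The sketch assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), ignores missed cleavages, and uses standard monoisotopic residue masses; the function names and the toy sequence are hypothetical and do not reproduce any of the cited tools.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Split after K or R, except before P (no missed cleavages).

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", sequence) if pep]


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(RESIDUE_MASS[res] for res in peptide) + WATER_MASS


def build_peptide_index(proteins):
    """Map each protein name to its list of (peptide, mass) pairs."""
    return {
        name: [(pep, round(peptide_mass(pep), 4)) for pep in tryptic_digest(seq)]
        for name, seq in proteins.items()
    }


# Hypothetical toy sequence, not a real protein.
index = build_peptide_index({"toy_protein": "AGKRPLKGGR"})
```

The resulting index links each protein to its expected peptides and masses, which is the kind of lookup table a fingerprint comparison would be run against.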
[0050] Starting from the expected fingerprint obtained from
the analysis of the databases, it is possible to determine
post-translational modifications, if some are present.
Indeed, the theoretical mass of a peptide may be
calculated assuming that no post-translational modifications are
present, and any difference between the theoretical
mass and the observed mass implies the presence of a
moiety on the peptide. The mass of said moiety is easily
calculated and the use of databases recording the
characteristics of the most common post-translational
moieties (for example glycosylation, or phosphorylation)
may make it possible to determine its nature.
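
The mass-difference reasoning can be illustrated with a small lookup. The table below holds a handful of well known monoisotopic mass shifts and is only a stand-in for the modification databases referred to in the text; the function names and the tolerance value are assumptions made for this sketch.

```python
# Monoisotopic mass shifts (Da) of a few common modifications.
# A real analysis would query a curated modification database instead.
COMMON_MODIFICATIONS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "oxidation": 15.99491,
    "HexNAc glycosylation": 203.07937,
}


def mass_shift(theoretical_mass, observed_mass):
    """Mass of the extra moiety implied by the two peptide masses."""
    return observed_mass - theoretical_mass


def candidate_modifications(theoretical_mass, observed_mass, tol=0.02):
    """Names of modifications whose mass shift matches within tol (Da)."""
    shift = mass_shift(theoretical_mass, observed_mass)
    return sorted(
        name
        for name, delta in COMMON_MODIFICATIONS.items()
        if abs(shift - delta) <= tol
    )


# Hypothetical peptide: predicted at 1000.0000 Da, observed at 1079.9663 Da.
candidates = candidate_modifications(1000.0, 1079.9663)
```

An empty result means either that the peptide is unmodified or that the moiety is not in the table, so the sketch cannot distinguish combined or unusual modifications.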
[0051] The quantity of the proteins on the feature on
the array is determined either on a relative scale (one
compared to the other), or on an absolute scale, with the
help of a standard peptide. Mass spectrometry indeed
makes it possible to correlate the quantity of a peptide with the
intensity of the peak corresponding to said peptide.
[0052] Thus, it is possible to compare different peptide
fingerprints from different features of the array and/
or to compare peptides on one feature, by comparing 
their peak intensity. 

[0053] Both measurements (qualitative and quantita- 
tive) are preferably achieved by mass spectrometric 
analysis. 

[0054] The invention also allows the analysis and 
comparison of two or more protein samples in a single 
procedure. The invention relates to a method for identi- 
fying proteins that are qualitatively or quantitatively dif- 
ferentially expressed between at least two biological 
samples containing proteins, said method comprising: 

a) labeling the proteins in each sample with a differ- 
ent marker, with the optional possibility of not labe- 
ling the proteins in one sample, 

b) mixing together the proteins of all different sam- 
ples and bringing the mixture into contact with an 
array comprising one or more feature(s), leading to 
specific capture of different proteins on different
feature(s) on the array,

c) applying, to the proteins captured on the feature 
(s) of the array, a procedure allowing identification/ 
response of the markers, the differences in the data 
obtained for each marker allowing the identification 
of the proteins that are qualitatively and/or quanti- 
tatively differentially expressed between the differ- 
ent samples. 

[0055] In a specific embodiment, the composition or 
sequence of some of the proteins in said biological sam- 
ples may be at least partially unknown. The method of 
the invention allows nevertheless to determine that 
these unknown proteins are differentially expressed 
(quantitatively or qualitatively), and the final identifica- 
tion of the unknown proteins may be performed by se- 
quencing of the proteins, in particular by mass spec- 
trometry. 

[0056] The method can be used to assay exactly two 
different samples, or more than two different samples. 
[0057] In the preferred embodiment, the procedure al- 
lowing identification/response of the different markers 
used for the different samples comprises the digestion 
of the proteins on the feature(s), in particular by adding
a protease or a cleavage reagent to the feature(s) of the
array, giving a digest mixture of the protein(s) that are 
localized on said feature. 

[0058] In the preferred embodiment, said procedure 
comprises analysis of said digest mixture by mass spec- 
trometry, and in particular, matrix-assisted laser desorp- 
tion/ionization is used to transfer the peptides into the 
mass spectrometer. 

[0059] The figures in the application explain the prin- 
ciple of the invention. When different tags are used on 
the different samples (and it is possible not to use a tag 
on one of the samples, when mass spectrometry is performed),
the relative abundance of the tags is used for
analysis of the relative abundance of the proteins.
[0060] When mass spectrometry is used, different
mass tags (described below) are used in the different
samples. Upon digestion of the proteins by proteases or
cleavage reagents, some peptides will be labeled by the
mass tag, while others will not (for example, if a mass
tag specific for the N-terminus of the protein is used, only
the N-terminal peptide will be labeled). Analysis by mass
spectrometry will then give a spectrum in which the
unlabeled peptides originating from the proteins of all
samples give a single peak, while the labeled peptides
are resolved into separate peaks, the spacing
between the peaks being equal to the difference in the
mass tags.

[0061] Analysis of the difference in the intensity of the
peaks gives immediate knowledge of the relative
abundance of the target protein in each sample. By an
abuse of language, the "intensity of the peaks specific
to each mass tag marker" may be called "the intensity
of the marker".
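The peak-pair reading described in paragraphs [0060] and [0061] can be sketched as follows. This is a minimal illustration only: the function name `tag_pairs`, the tolerance, the 4 Da tag difference and the peak list are hypothetical, and singly charged ions are assumed so that the m/z spacing equals the mass spacing.

```python
from itertools import combinations


def tag_pairs(peaks, tag_delta, tol=0.05):
    """Find peak pairs separated by the mass-tag difference.

    peaks: list of (m/z, intensity) tuples.
    Returns (light_mz, heavy_mz, heavy/light intensity ratio) tuples.
    """
    pairs = []
    ordered = sorted(peaks)  # ascending m/z, so the second peak is the heavier one
    for (mz_a, int_a), (mz_b, int_b) in combinations(ordered, 2):
        if abs((mz_b - mz_a) - tag_delta) <= tol:
            pairs.append((mz_a, mz_b, int_b / int_a))
    return pairs


# Invented spectrum: two unlabeled peptides (single peaks) and one labeled
# peptide that appears twice, 4 Da apart, once per sample.
spectrum = [(842.51, 1000.0), (1045.56, 400.0), (1049.56, 800.0), (1475.75, 650.0)]
for light, heavy, ratio in tag_pairs(spectrum, tag_delta=4.0):
    print(f"{light:.2f} / {heavy:.2f}: heavy/light intensity ratio = {ratio:.2f}")
```

Under these assumptions, the ratio of the two peak intensities is read directly as the relative abundance of the protein in the two samples.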

[0062] In a preferred embodiment the proteins of a 
protein mixture or extract are stratified on a capture ar- 
ray by binding to structural elements (features). These 
features are preferably attached to the surface of a car- 
rier that can be brought into contact with the full protein 
extract. Preferably the features are made up of mole- 
cules or combinations of the molecules of the following 
list: nucleic acids, oligonucleotides, oligopeptides, 
polypeptides, antibodies, oligosaccharides, polysaccharides,
organic molecules, polymers and inorganic
molecules. In order to generate a diverse library of features,
different approaches may be applied. Features are
applied to significantly reduce the complexity of a protein
mixture by binding only one or a few proteins per
feature. In contrast to most array technologies,
strictly selective binding (one feature, one protein) is not
required.

[0063] In the case of nucleic acid arrays the following
strategy for the generation of diversity is preferably
adopted. A DNA sequence, for example, that is known
to form a stable 3-dimensional structure is used as a
starting point. This DNA sequence is amplified by PCR
in a manner that results in the original sequence being
mutated, and therefore in new 3-dimensional structures.
These procedures are called error-prone PCR (Wang et
al. J. Comput. Biol. 7, 143-158, 2000; Cherry et al. Nat.
Biotechnol. 17, 333-334, 1999). Another strategy makes
use of imitating natural recombination and DNA (exon)
shuffling (Volkov and Arnold Methods Enzymol. 328,
447-456, 2000; Petrounia and Arnold Curr. Opin. Biotechnol.
11, 325-330, 2000; Kolkman and Stemmer Nat.
Biotechnol. 19, 423-428, 2001; Minshull and Stemmer
Curr. Opin. Chem. Biol. 3, 284-290, 1999; Crameri et al.
Nat. Med. 2, 100-102, 1996). To introduce mutations the
PCR can be starved of a required building block (one of
the four dNTPs), which results in errors of the DNA
polymerase, or a variant of a DNA building block
that can act as a substitute for several of the
bases can be introduced. This way random sequences starting from a
defined sequence are generated. Finally, the mutated PCR
product is cloned into a vector, used for transfection and
grown in culture. The different colonies are picked, the
DNA harvested and the inserts amplified by PCR. Each
colony gives rise to a feature on the array. The PCR
products are spotted onto a carrier in an arrayed structure.
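The idea of generating a diverse library from one defined starting sequence can be illustrated in silico. The sketch below is only a toy model of random point mutation, not a simulation of PCR chemistry: the per-base error rate, the template sequence and the function names `mutate` and `make_library` are all assumptions made for illustration.

```python
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Replace each base by a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def make_library(template, n_variants, rate=0.02, seed=0):
    """Return n_variants independently mutated copies of the template."""
    rng = random.Random(seed)  # seeded so the library is reproducible
    return [mutate(template, rate, rng) for _ in range(n_variants)]


# Hypothetical template; each variant would correspond to one colony,
# and therefore to one feature on the array.
template = "GGTTGGTGTGGTTGGACGTACGATCGATCGGATC"
for variant in make_library(template, n_variants=5, rate=0.05):
    print(variant)
```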

[0064] RNA molecules can also be made by RT-PCR
to the same end, for example, by a procedure called SELEX
(Systematic Evolution of Ligands by Exponential
enrichment, Tuerk and Gold, Science 249, 505-510,
(1990)). This procedure involves cycles of affinity selection
by a target molecule from a heterogeneous population
of nucleic acids, replication of the bound species
(the ligands), and in vitro transcription to generate an
enriched pool of RNA. Binding RNAs are also termed
aptamers, which have a defined 3-dimensional structure.
Departing from the strategy of aptamer generation,
we accept the unselected library to generate the diverse
structures of our array. Similarly, the molecules
take unique and defined structures. As said, aptamers
are selected to accommodate a certain structure and
are thus specific for a particular interaction partner. Here
it is sought that the structures of the features are generated
to contain a degree of randomness.
[0065] In the case of the array of features being made
up of proteins an approach similar to Affibody (www.affibody.com)
is applied, except that, in this case, it is preferred
if the structures are not selected in a certain direction,
but randomly mutated away from a defined
structure (Hansson et al., Immunotechnology, 4,
237-252 (1999); Gunneriusson et al., Protein Engineering,
12, 872-878 (1999)). By injection of a mouse with a
protein a large amount of polyclonal antibodies is generated
in the animal. These antibodies can be isolated
for use as a binding component of the respective protein.
No monoclonal antibodies are needed, as only
semi-specific binding is desired.

[0066] In the case of the later described sequence
specific protease digestion of the captured protein, specific
digestion of capture features will also occur. The
peaks resulting from the features are subtracted from
the mass spectra for the characterization of the captured
proteins. These peaks can also serve to assess the
quality of the individual feature of the array or the quality
of the reactions carried out on the feature. In the case
of protein arrays the randomized sequences can also
be cloned into an expression vector and allowed to express
protein. The individual clones are spotted onto the
carrier in an arrayed format.

[0067] In a preferred embodiment, said features are
immobilized on a surface made of materials such as
glass, silicate, metal, metal-coated glass, glass-coated
metal or plastic. The methods that lend themselves to
immobilization are the following: binding via NH2, I,
SH, N-hydroxy-succinimide, biotin, His6 or other. The
coating of the carrier material can be NH2, I, SH, N-hydroxy-succinimide,
streptavidin, Ni or any other chemical group
specifically interacting with the functionality of
the feature. Immobilization can rely on covalent binding
of the substrate on the surface or other strong interaction
that can withstand subsequent reactions on the array.
In another preferred embodiment the features are
captured inside pores of a substrate. These substrates
could be a membrane (Nylon, PVDF) or a gel pad (agarose,
polyacrylamide). The features can be covalently
bound to the porous material. This can be achieved by
photochemical or chemical cross-linking. Alternatively,
hydrophobic interaction can serve to immobilize the features.
This support of the features has the advantage
that, in contrast to the 2-dimensional surface of, for example,
a glass slide, it is more amenable to maintaining
the 3-dimensional structure of the feature. Unbound
features are removed by washing.
[0068] An important and preferred characteristic of
each of the arrayed features is that a specific protein or
a specific group of proteins has affinity, preferably high
affinity, for it. Thus the entire protein extract is split into
fractions (stratification) on the different features of the
array.

[0069] The present invention relates to the identification
and quantification of the captured proteins on each
feature and the use of proteins captured on several features
to assess quantities and post-translational modifications.
The final objective is to create an array capable
of capturing all proteins of the proteome of a cell on
defined positions for comparative analysis of proteins,
such as identities, post-translational modifications and
quantities.

[0070] Thus, the array that is being used in the invention
preferably comprises a number of diverse features
high enough to bind the proteins in the extract. Different
arrays may be used if the aim is to study specific proteins
or specific types of proteins (nucleic acid-binding proteins,
membrane proteins, antigen-binding proteins...).
[0071] The identification (qualitative characterization)
of the captured proteins is preferably achieved by a
method comprising the generation of a peptide fingerprint
of the captured proteins, directly on the feature,
by mass spectrometry.

[0072] This is achieved by digesting captured proteins
on the feature with a sequence specific protease, or a
cleavage reagent, and mass analyzing the peptide fragments.
By comparison with databases the peptide mass
fingerprint suffices to identify the proteins the peptide
fragments originate from (trypsin is the protease most
frequently used for this purpose). Alternatively, CNBr cleavage,
which cuts at methionine, can be applied. Other reagents
include Lys-C, Arg-C, Asp-N, V8-bicarb,
V8-phosp, chymotrypsin.
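A peptide mass fingerprint comparison of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the scoring used by any particular search engine: trypsin is modeled as cutting after K or R except before P, missed cleavages are ignored, neutral monoisotopic masses are compared, and the two-entry database, the tolerance and all function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(seq):
    """Cut after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_count(observed, seq, tol=0.2):
    """Number of observed masses explained by the in-silico digest of seq."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, database, tol=0.2):
    """Name of the database entry explaining the most observed masses."""
    return max(database, key=lambda name: match_count(observed, database[name], tol))


# Hypothetical database and observed mass list.
database = {"protein_A": "AAKGGRPLLKMSR", "protein_B": "WWWWKFFFR"}
observed = [288.18, 392.18, 739.47]
print(best_match(observed, database))
```

Counting matched masses is the simplest possible score; real search tools weight matches by peptide frequency and mass accuracy.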

[0073] As explained above, post-translational modifications
of the proteins can be identified by a mass shift of
the expected peptide mass in the peptide mass fingerprints.
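As an illustration of reading a modification from a mass shift, the sketch below checks whether an observed mass equals an expected peptide mass plus a known modification mass. The two monoisotopic shifts listed (phosphorylation, acetylation) are standard values; the function name `explain_shift`, the tolerance and the example masses are assumptions.

```python
# Monoisotopic mass additions in Da for two common modifications.
PTM_SHIFTS = {"phosphorylation": 79.96633, "acetylation": 42.01057}


def explain_shift(observed_mass, expected_masses, tol=0.02):
    """Return (expected_mass, modification) pairs that explain observed_mass."""
    hits = []
    for expected in expected_masses:
        for name, shift in PTM_SHIFTS.items():
            if abs(observed_mass - (expected + shift)) <= tol:
                hits.append((expected, name))
    return hits


# Invented example: an unassigned peak lying 79.966 Da above an expected peptide.
expected = [1000.0, 1500.0]
print(explain_shift(1000.0 + 79.96633, expected))
```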
[0074] In the case of a mixture of several proteins the
compounded peptide mass fingerprint can be deconvoluted
to identify the proteins contained in the mixture.
The current state-of-the-art allows deconvoluting mixtures
of up to 20 proteins from peptide mass fingerprints.
With the improvement of mass spectrometric methods
this number is likely to increase.

[0075] In the case of an array of features that are of 
protein nature, the feature will contribute peptides to the 
mass spectrometric analysis. They are subtracted from 
the mass list for database comparison, thus only leaving 
the remaining peaks as being attributed to captured pro- 
teins. 
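The subtraction of feature-derived peaks from the mass list can be sketched as a simple tolerance filter. The function name `subtract_feature_peaks`, the tolerance and the example values are hypothetical; the feature's own peak list is assumed to have been recorded from a digest of the feature alone.

```python
def subtract_feature_peaks(sample_masses, feature_masses, tol=0.1):
    """Keep only the masses that do not coincide with a known feature peptide."""
    return [
        m for m in sample_masses
        if all(abs(m - f) > tol for f in feature_masses)
    ]


# Invented values: 842.5 coincides with a peptide of the capture feature.
remaining = subtract_feature_peaks([500.0, 842.5, 1045.6], [842.45])
print(remaining)
```

Only the remaining masses are then submitted to the database comparison.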

[0076] In a preferred embodiment of the invention matrix-assisted
laser desorption/ionization is used to transfer
the peptides of a feature into the mass spectrometer.
This can be done by directly introducing the array, thus
not transferring the samples prepared on the array. An
advantage of this is that there is no loss of material for
the analysis and, more importantly, no selective loss.
Typically, a matrix has to be added prior to the introduction
of the samples into the mass spectrometer. A preferred
matrix for this is α-cyano-4-hydroxy-cinnamic acid,
but other matrices could also be used. State-of-the-art
MALDI mass spectrometers allow sizing of desorption
products as well as the analysis of post-source decay
(the spontaneous fragmentation of the peptide
bonds after the desorption process, which allows the determination
of the amino acid sequence of the peptide).
Analysis of post-source decay may be needed when a
peptide fingerprint cannot be assigned to a protein
present in a database (for example if the protein is not
known). Obtaining the sequence of the different peptides
will allow the whole sequence of the protein to be
reconstituted. This may require performing two analyses, using
two different proteases, in order to speed up the process
of reconstituting the whole protein sequence. The person
skilled in the art is aware of the techniques to use
for applying mass spectrometry to obtain the sequence
of an unknown protein.

[0077] Alternatively, other mass spectrometers, like 
electrospray instruments can also be applied. For this 
the individual sample may have to be eluted from the 
feature and transferred into the mass spectrometer. An 
advantage of using this sort of mass spectrometer (es- 
pecially ESI) is that they provide significantly higher res- 
olution and that they are routinely coupled with sector 
analysis. Therefore, breaks can be actively introduced 
by bleeding a fragmentation gas into the mass spec- 
trometer. This provides an active rather than a passive 
means for peptide sequencing. 

[0078] The method of the invention is particularly performed
with identification of the masses of the peptides
by time-of-flight or magnetic field deviation analysis in
an ion trap or quadrupole.

[0079] As previously said, it is advantageously com- 
pleted by further characterization of the peptides in said 
digest mixture for sequence elements by analysis of 
post source decay products, if needed, in particular if 
the step of comparison of peptide masses to a protein
database to identify the proteins that were bound to the
feature and to identify post-translational modifications 
of the proteins does not prove successful (for example 
Mowse and Mascot, op.cit). 

[0080] In a particularly preferred embodiment of this 
invention two or more protein extracts are compared to 
each other. In order to distinguish the two or more pro- 
tein extractions at least one of the protein extracts is 
subjected to a modification chemistry that results in de- 
fined modifications of chemical functions of the proteins. 
[0081] A preferred modification is such as not to alter
the 3-dimensional structure of the protein. A way to
achieve this may be by attaching a chemical group by
trinitrobenzene sulfonic acid, ethylthiofluoro acetate,
succinic anhydride, phenylisothiocyanate, Dansyl chloride,
acetic anhydride, polyethylene glycol, or similar reagents
to the (deprotected) N-terminal group of the protein.
As the N-termini of proteins are frequently blocked,
it can be necessary to cleave off these blocking groups
prior to N-terminal modification. Methods for this are described
by Kamp and Hirano (Chapter 22, Proteome and
Protein Analysis, ed. Kamp, Springer Verlag, 2000).
[0082] Different protein extracts would, for example,
be tagged with different mass-shifting molecules.
Thereafter, by comparing correlatable N-terminal peptides
stemming from different protein extracts, the abundance
of a particular protein in the different protein extracts is
measured.

[0083] For calibration of the entire array, the effects 
on proteins captured by different features are correlated 
with each other by a global analysis. This way one can 
determine which are the proteins which have changed, 
increased or decreased in different protein extractions. 
[0084] It is also possible to modify several of the peptides
of a protein, for example by NH2-specific protein
modification with trinitrobenzene sulfonic acid, ethylthiofluoro
acetate, succinic anhydride, phenylisothiocyanate,
Dansyl chloride, acetic anhydride, attachment of
polyethylene glycol or similar reagents. Also SH-specific
protein modification with β-mercaptoethanol, dithiothreitol,
iodoacetic acid, iodoacetamide or similar reagents
is possible. For chemically modifying carboxyl groups of
proteins, full or limited amidation by 1-ethyl-3-[3-(dimethylamino)propyl]carbodiimide
hydrochloride (EDC) and an
amine like glycine methyl ester, glycinamide, methylamine,
ethanolamine or similar can be used. Tyrosines
can be modified by nitration with tetranitromethane (also
oxidation of thiols) and tryptophan can be modified by
specific oxidation, for example by N-bromosuccinimide,
or limited oxidation with ozone. Again, different protein
extracts are modified with similar compounds (same
chemical functionalities, thus the chemical yield of reactions
has less of an influence) but with different masses.
Modifying compounds can differ by having additional
methyl groups. Modifying chemical groups can be
isotopically pure, which results in less broad peaks in
the peptide mass fingerprints. For tagging different extracts,
similar chemical compounds with different isotope
composition are employed. By these modifications several
of the peptides of each protein can be drawn into the
quantification procedure.

[0085] Protein mixes can either be stratified together
or they can be applied to the array one by one. In this
strategy the modification of each of the protein extracts
can be carried out after the capture. This way the
effect of the modification on the 3-dimensional structure
can be reduced.
[0086] The method of the invention is preferably applied
with an array that comprises multiple features allowing
binding of the whole proteome of a target organism
or a target cell, allowing subsequent identification
and/or quantification and/or characterization of proteins
comprised in said proteome.

[0087] For obtaining the quantification, a known protein
(standard) may be quantified by known methods, and
the quantity of another target protein in the proteome is
obtained from the relative intensity of the mass peaks of
said target protein and said standard protein.
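The standard-based quantification amounts to a simple proportion, sketched below. The sketch assumes that peak intensity scales with the amount of protein in the same way for the standard and the target (equal response), which is a simplification; the function name and the numbers are hypothetical.

```python
def estimate_quantity(standard_amount, standard_intensity, target_intensity):
    """Scale the known amount of the standard by the peak intensity ratio."""
    if standard_intensity <= 0:
        raise ValueError("standard intensity must be positive")
    return standard_amount * target_intensity / standard_intensity


# Invented example: 10 pmol of standard gives a peak of 2000 counts,
# and the target peptide gives a peak of 500 counts.
print(estimate_quantity(10.0, 2000.0, 500.0))
```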
[0088] By analyzing the quantitative and qualitative 
differences in protein expression between at least two 
samples, the method and the array of the invention may 
be used for various applications, amongst which: 

A method for identifying compounds that interact 
with a selected target protein, comprising the steps 
of: 

a) applying said protein on an array comprising 
different features, said features comprising dif- 
ferent compounds, 

b) selecting the compounds that interact with 
the selected target protein, by identifying the 
features to which is bound said protein. 

This method may be used in particular if the
process of attaching the proteins on the array does
not modify the 3-dimensional structure of the proteins.
Rather than testing a multiplicity of compounds
on a specific target, this method allows testing
a multiplicity of compounds on a multiplicity of
potential targets.

A method for identifying proteins that are qualita- 
tively or quantitatively differentially expressed be- 
tween two different distinguishable phenotypes, (as 
defined above) comprising performing the method 
of the invention, on a first sample that is represent- 
ative of a first phenotype and a second sample that 
is representative of a second distinguishable phenotype.
In a specific embodiment, the first sample
is harvested from a patient suffering from a disease,
and the second sample is obtained from a patient
who does not suffer from said disease. In a preferred
embodiment, said two distinguishable phenotypes
are tumor-bearing patient and healthy patient.

A method for identifying proteins that are qualitatively
or quantitatively differentially expressed following
treatment of a biological sample with a test
compound, comprising performing the method of
the invention on a first sample that is representative
of said biological sample before treatment with said
compound, and a second sample that is represent- 
ative of said biological sample during or after treat- 
ment with said compound. 

A method of determining or assessing the therapeutic
potential of a test compound with respect to a
biological sample, comprising:

a) performing the method as described above
on said biological sample with a compound
known to have therapeutic potential, in order to
identify the proteins that are qualitatively or
quantitatively differentially expressed following
treatment of a biological sample with a compound
with therapeutic potential,

b) applying said test compound to said biological
sample,

c) deducing the therapeutic potential of said
test compound by the expression profile of the
proteins in the sample and/or presence of said
proteins identified in step a).

A method of determining or assessing the responsiveness
of a target patient to a test or treatment,
comprising:

a) performing the method described above on a
biological sample issued from a reference patient
that is responsive to said test or treatment,
in order to identify the proteins that are qualitatively
or quantitatively differentially expressed
in response to said test or treatment,

b) performing the method of the invention, allowing
comparison between two samples, on biological
samples issued from said target patient
before and after start of the test or treatment,
deducing the responsiveness of said target patient
to said test or treatment by the presence
of said proteins identified in step a).

[0089] The invention described here will replace recent
tools for proteomics for efficient study of protein expression.
This can be done for example in microorganisms
with the ultimate goal to engineer improved production
strains for fine chemicals. Another application
will be the comparative study of the proteome of cells
with a normal phenotype and cells with an abnormal
phenotype such as a disease. The gained information
can be used to create targets for medication. The principle
of the procedure would be similar for each kind of
application.
[0090] It is the aim to establish protein arrays covering
the whole proteome of a cell or a microorganism.
[0091] Therefore in a first step features are identified
by one of the methods described above and positioned
on a defined position on an array. For example, a large
number of features created by evolutional methods is
displayed on an array. Binding of proteins on different
features is analyzed by mass spectrometry. Features
binding a certain number of proteins, which can be distinguished
by the peptide patterns after trypsin digestion,
are selected for the final analytic array. When the
proteins binding to some features are not known, a further
analysis (sequencing) is performed to determine their
nature.

[0092] These chosen features address known proteins
or newly identified proteins, as is done with RNA
arrays, in order to reduce significantly the complexity of
a protein mixture. It is optimal if approximately 10 different
proteins bind per feature, so that approximately 600
features are estimated to be sufficient to study the proteome
of E. coli.

[0093] By the introduction of different mass tags, pro- 
teins originating from different samples can be distin- 
guished in relation with their provenience and the rela- 
tive quantity of said proteins can be analyzed by quan- 
tification of the height of mass spectrometric signals. 
[0094] Additionally post-translational modifications 
like phosphorylation of a protein, the major tool of a cell 
for regulation, can be distinguished. 

Description of the figures 

[0095] 

Figure 1 represents the labeling of two different pro- 
tein extracts with a different mass tag. 
Figure 2 represents the stratification (discriminative
binding) of the proteins on features on an array that
depends on the nature of the proteins.

[0096] In Figure 3, mass analysis of the proteins on 
each feature is performed and the relative quantity of 
the proteins between the two samples is assessed by 
comparing the peak intensities. 

[0097] In Figure 4, mass spectrometry is performed
after digestion of the proteins on the feature. Only one peptide
issued from the digest is labeled with a mass tag.
The peptides that are not labeled give the same peak
for the proteins issued from the two samples, while a
shift is observed for the labeled peptide, depending on
the mass tag. Analysis of the peak intensity allows relative
quantification of the proteins between the two samples.

[0098] In Figure 5, the mass tags have been chosen
so as to be introduced at multiple sites in the proteins.
Of the four peptides obtained upon digestion, two do not
bear a mass tag and cannot be separated by mass
spectrometry, one bears one mass tag, and separation
occurs, and the last one bears two mass tags, and
the discriminative gap is bigger than for the one that only
bears one mass tag. Again, analysis of the peak
intensity gives information about the relative quantities of
the proteins between the two samples.

Claims

1. A method for identifying and/or quantifying and/or
characterizing multiple proteins in a sample containing
proteins, said method comprising:

a) optionally labeling the proteins in the sample
with a marker,

b) bringing the proteins into contact with an array
comprising one or more feature(s), leading
to specific capture of different proteins on different
feature(s) on the array,

c) applying, to the protein(s) captured on the
feature(s) of the array, a procedure giving a fingerprint
of the protein(s) on the feature(s), the
comparison of the data obtained for the protein(s)
on the feature(s) with a fingerprint library allowing
identification and/or quantification and/or
characterization of the proteins present in
the sample, including the post-translational
modifications.

2. A method for identifying proteins that are qualitatively
or quantitatively differentially expressed between
at least two biological samples containing
proteins, said method comprising:

a) labeling the proteins in each sample with a
different marker, with the possibility of not labeling
the proteins in one sample,

b) mixing together the proteins of all different
samples and bringing the mixture into contact
with an array comprising one or more feature(s),
leading to specific capture of different proteins
on different feature(s) on the array,

c) applying, to the proteins captured on the feature(s)
of the array, a procedure allowing identification/response
of the markers, the differences
in the data obtained for each marker allowing
the identification of the proteins that are
qualitatively and/or quantitatively differentially
expressed between the different samples.

3. The method of claim 1 or 2, wherein said features
are chosen in the group consisting of nucleic acids,
oligonucleotides, oligopeptides, polypeptides, antibodies,
oligosaccharides, polysaccharides, organic
molecules, polymers, inorganic molecules, and
combinations thereof.

4. The method of any of claims 1 to 3, wherein said
features are immobilized on a surface.

5. The method of any of claims 1 to 4, wherein said
array comprises a multiplicity of different features.

6. The method of any of claims 1 to 5, wherein said
features are localized on said array such as to allow
the identification of the address of each feature.

7. The method of any of claims 1 to 6, wherein each
feature specifically interacts with one protein
present in the protein mixture.

8. The method of any of claims 1 to 6, wherein at least 
one feature specifically interacts with more than one 
of the proteins present in the protein mixture, 
wherein said feature does not interact with all the 
proteins present in the protein mixture. 

9. The method of any of claims 1 to 8, wherein said 
labeling of the proteins consists in a chemical mod- 
ification of the proteins. 

10. The method of claim 9, wherein said chemical mod- 
ification does not alter the 3-dimensional structure 
of the protein. 



11. The method of any of claims 9 to 10, wherein said
chemical modification consists in attaching a chemical
group chosen in the group consisting of trinitrobenzene
sulfonic acid, ethylthiofluoro acetate,
succinic anhydride, phenylisothiocyanate, Dansyl
chloride, acetic anhydride, polyethylene glycol, and
similar reagents to the (deprotected) N-terminal
group of the protein.

12. The method of any of claims 9 to 10, wherein said
chemical modification consists in inducing SH-specific
protein modifications with an agent chosen in
the group consisting of β-mercaptoethanol, dithiothreitol,
iodoacetic acid, iodoacetamide, and the like.

13. The method of any of claims 9 to 10, wherein said
chemical modification consists in modifying carboxyl
groups of proteins by full or limited amidation
using an agent chosen in the group consisting of
1-ethyl-3-[3-(dimethylamino)propyl]carbodiimide
hydrochloride (EDC), an amine like glycine methyl ester,
glycinamide, methylamine, ethanolamine and
the like.

14. The method of any of claims 9 to 10, wherein said
chemical modification consists in a modification of
tyrosines by nitration with tetranitromethane (also
oxidation of thiols), or of tryptophans by specific oxidation,
for example by N-bromosuccinimide, or by
limited oxidation with ozone.

15. The method of any of claims 1 to 14, wherein the
procedure allowing identification/response of the
marker(s) comprises adding a protease to the feature(s)
of the array, giving a digest mixture of the
protein(s) that are localized on said feature.

16. The method of claim 15, wherein said procedure 
comprises analysis of said digest mixture by mass 
spectrometry. 

17. The method of claim 16, wherein matrix-assisted laser
desorption/ionization or electrospray/ionization
is used to transfer the peptides into the mass spectrometer.

18. The method of claim 17, wherein a matrix or a sol- 
vent is added to each feature for matrix-assisted la- 
ser desorption/ionization or electrospray/ionization. 

19. The method of claim 16, wherein the peptides are 
transferred to an analysis array. 

20. The method of any of claims 16 to 19, wherein the
masses of the peptides are identified by time-of-flight
or magnetic field deviation analysis in an ion
trap or quadrupole.

21. The method of any of claims 16 to 20, further comprising
characterization of the peptides in said digest
mixture for sequence elements by analysis of
post source decay products.

22. The method of any of claims 16 to 21, wherein peptide
masses are compared to a protein database to
identify the proteins that were bound to the feature
and to identify post-translational modifications of
the proteins.

23. The method of claim 2, wherein the intensity of each
marker is used for protein quantification between
the two protein samples.

24. The method of claim 2, wherein exactly two different
samples are assayed.

25. The method of claim 2, wherein more than two dif- 
ferent samples are assayed. 

26. The method of claim 1, wherein said array comprises
multiple features allowing binding of the whole
proteome of a target organism or a target cell, allowing
subsequent identification and/or quantification
and/or characterization of proteins comprised
in said proteome.

27. A method for identifying compounds that interact 
with a selected target protein, comprising the steps 
of: 

a) applying said protein to an array comprising
different features, said features comprising different
compounds,

b) selecting the compounds that interact with
the selected target protein, by identifying the
features to which said protein is bound.

28. A method for identifying proteins that are qualita- 
tively or quantitatively differentially expressed be- 
tween two different distinguishable phenotypes, 
comprising the step of performing the method of
claim 2, on a first sample that is representative of a
first phenotype and a second sample that is representative
of a second distinguishable phenotype.

29. A method for identifying proteins that are qualita- 
tively or quantitatively differentially expressed fol- 
lowing treatment of a biological sample with a test 
compound, comprising the step of performing the 
method of claim 2 on a first sample that is repre- 
sentative of said biological sample before treatment 
with said compound, and a second sample that is 
representative of said biological sample during or 
after treatment with said compound.

30. A method of determining or assessing the therapeutic
potential of a test compound with respect to a
biological sample, comprising the steps of:

a) performing the method of claim 29 to said 
biological sample with a compound known to 
have therapeutic potential, in order to identify 
the proteins that are qualitatively or quantita- 
tively differentially expressed following treat- 
ment of a biological sample with a compound 
with therapeutic potential, 

b) applying said test compound to said biolog- 
ical sample, 

c) deducing the therapeutic potential of said 
test compound by the presence of said proteins 
identified in step a). 

31. A method of determining or assessing the respon- 
siveness of a patient to a test or treatment, compris- 
ing the steps of: 

a) performing the method of claim 29 to a bio- 
logical sample issued from a patient that is re- 
sponsive to said test or treatment, in order to 
identify the proteins that are qualitatively or 
quantitatively differentially expressed in re- 
sponse to said test or treatment, 

b) performing the method of claim 2 to biologi- 
cal samples issued from said patient before and 
after start of the test or treatment, deducing the 
responsiveness of said patient to said test or 
treatment by the presence of said proteins identified
in step a).